Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. One family of such approaches tackles the COD by solving high-dimensional PDEs, opening the door to a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. To tackle these shortcomings, we show that Tensor Neural Networks (TNN) provide significant parameter savings while attaining the same accuracy as a classical Dense Neural Network (DNN), and that TNN can be trained faster than DNN for the same accuracy. Beyond TNN, we introduce the Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance at an equivalent parameter count compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
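For reference, the pricing equation associated with the Heston model for an option value $U(S, v, t)$ on an asset price $S$ with stochastic variance $v$ takes the standard form (the abstract does not specify the exact formulation or boundary conditions used):

$$\partial_t U + \tfrac{1}{2} v S^2 \,\partial_S^2 U + \rho \xi v S \,\partial_S \partial_v U + \tfrac{1}{2} \xi^2 v \,\partial_v^2 U + r S \,\partial_S U + \kappa (\theta - v) \,\partial_v U - r U = 0,$$

where $r$ is the risk-free rate, $\kappa$ the mean-reversion speed, $\theta$ the long-run variance, $\xi$ the volatility of variance, and $\rho$ the correlation between the two driving Brownian motions.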
Machine Learning models capable of handling the large datasets collected in the financial world often become expensive-to-run black boxes. The quantum computing paradigm suggests new optimization techniques that, combined with classical algorithms, may deliver competitive, faster, and more interpretable models. In this work we propose a quantum-enhanced machine learning solution for the prediction of credit rating downgrades, also known as fallen-angels forecasting in the financial risk management field. We implement this solution on a neutral-atom Quantum Processing Unit with up to 60 qubits on a real-life dataset. We report competitive performance against the state-of-the-art Random Forest benchmark, while our model achieves better interpretability and comparable training times. We examine how to improve performance in the near term, validating our ideas with Tensor Network-based numerical simulations.
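As a point of reference, a minimal sketch of the kind of classical Random Forest benchmark described here is given below; the feature set, labels, and data are illustrative placeholders, not the paper's actual pipeline.

```python
# Sketch of a classical downgrade-prediction baseline: a Random Forest
# classifying whether an issuer becomes a "fallen angel".
# Feature names and data below are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 8))        # e.g. balance-sheet ratios, spreads
y = rng.integers(0, 2, size=n)     # 1 = rating downgraded

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```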
In this paper, we consider several algorithms for quantum computer vision using noisy intermediate-scale quantum (NISQ) devices and benchmark them on a real-world problem against their classical counterparts. Specifically, we consider two approaches: a quantum support vector machine (QSVM) on a universal gate-based quantum computer, and QBoost on a quantum annealer. The quantum vision systems are benchmarked on an imbalanced image dataset, where the goal is to detect defects in manufactured car pieces. We see that the quantum algorithms outperform their classical counterparts in several ways, with QBoost allowing larger problems to be analyzed with present-day quantum annealers. Data preprocessing, including dimensionality reduction and contrast enhancement, is also discussed, as is hyperparameter tuning in QBoost. To the best of our knowledge, this is the first implementation of quantum computer vision systems for a problem of industrial relevance in a manufacturing production line.
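To illustrate the QSVM idea, the sketch below simulates a quantum fidelity kernel classically and plugs it into a classical SVM. The product-state angle encoding is an illustrative assumption, not the circuit used in the paper.

```python
# Quantum-kernel SVM, simulated classically: each feature x_i is a
# rotation angle on its own qubit, so the state overlap factorizes as
# |<psi(x)|psi(x')>|^2 = prod_i cos^2((x_i - x'_i) / 2).
import numpy as np
from sklearn.svm import SVC

def fidelity_kernel(A, B):
    diff = A[:, None, :] - B[None, :, :]
    return np.prod(np.cos(diff / 2) ** 2, axis=-1)

rng = np.random.default_rng(1)
X_train, y_train = rng.uniform(0, np.pi, (80, 4)), rng.integers(0, 2, 80)
X_test = rng.uniform(0, np.pi, (20, 4))

svm = SVC(kernel="precomputed")
svm.fit(fidelity_kernel(X_train, X_train), y_train)
preds = svm.predict(fidelity_kernel(X_test, X_train))
```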
Partial differential equations (PDEs) are used to model a variety of dynamical systems in science and engineering. Recent advances in deep learning have enabled us to solve them in higher dimensions, addressing the curse of dimensionality in new ways. However, deep learning methods are constrained by training time and memory. To address these shortcomings, we implement Tensor Neural Networks (TNN), a quantum-inspired neural network architecture that leverages ideas from tensor networks to improve deep learning approaches. We demonstrate that TNN provide significant parameter savings while attaining the same accuracy as a classical Dense Neural Network (DNN). In addition, we also show how TNN can be trained faster than DNN for the same accuracy. We benchmark TNN by applying them to solve parabolic PDEs, specifically the Black-Scholes-Barenblatt equation, which is widely used in financial pricing theory. Further examples, such as the Hamilton-Jacobi-Bellman equation, are also discussed.
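For context, a common $d$-dimensional form of the Black-Scholes-Barenblatt benchmark (the abstract does not fix the exact setup, so this formulation is only indicative) is

$$\partial_t u + \frac{\sigma^2}{2} \sum_{i=1}^{d} x_i^2 \,\partial_{x_i}^2 u = r \Big( u - \sum_{i=1}^{d} x_i \,\partial_{x_i} u \Big), \qquad u(T, x) = \|x\|^2,$$

which admits the closed-form solution $u(t, x) = \exp\big((r + \sigma^2)(T - t)\big) \|x\|^2$, convenient for measuring a solver's accuracy in high dimensions.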
Here, we present a quantum algorithm for clustering data based on variational quantum circuits. The algorithm allows data to be classified into many clusters and can easily be implemented on few-qubit noisy intermediate-scale quantum (NISQ) devices. The idea of the algorithm relies on reducing the clustering problem to an optimization problem, which is then solved via a variational quantum eigensolver (VQE) combined with non-orthogonal qubit states. In practice, the method uses maximally orthogonal states of the target Hilbert space instead of the usual computational basis, allowing a large number of clusters to be considered even with few qubits. We benchmark the algorithm with numerical simulations using real datasets, showing excellent performance even with a single qubit. Moreover, by construction, tensor network simulations of the quantum algorithm yield quantum-inspired clustering algorithms that can run on current classical hardware.
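The key trick of hosting many cluster labels on one qubit can be illustrated classically; the equatorial spherical-code construction below is an illustrative assumption, not the paper's VQE procedure.

```python
# Illustrative sketch (an assumption, not the paper's algorithm): with
# one qubit, K non-orthogonal "label" states can be spread around the
# Bloch sphere equator, |psi_k> = (|0> + e^{i 2 pi k / K}|1>) / sqrt(2).
# Their pairwise overlaps stay below 1, which is what makes many
# clusters representable with very few qubits.
import numpy as np

def label_states(K):
    phases = np.exp(1j * 2 * np.pi * np.arange(K) / K)
    return np.stack([np.full(K, 1 / np.sqrt(2)), phases / np.sqrt(2)], axis=1)

states = label_states(4)                       # 4 cluster labels, 1 qubit
overlaps = np.abs(states.conj() @ states.T) ** 2
print(np.round(overlaps, 3))   # 1.0 on the diagonal, < 1 off-diagonal
```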
We introduce the Conditional Independence Regression CovariancE (CIRCE), a measure of conditional independence for multivariate continuous-valued variables. CIRCE applies as a regularizer in settings where we wish to learn neural features $\varphi(X)$ of data $X$ to estimate a target $Y$, while being conditionally independent of a distractor $Z$ given $Y$. Both $Z$ and $Y$ are assumed to be continuous-valued but relatively low dimensional, whereas $X$ and its features may be complex and high dimensional. Relevant settings include domain-invariant learning, fairness, and causal learning. The procedure requires just a single ridge regression from $Y$ to kernelized features of $Z$, which can be done in advance. It is then only necessary to enforce independence of $\varphi(X)$ from residuals of this regression, which is possible with attractive estimation properties and consistency guarantees. By contrast, earlier measures of conditional feature dependence require multiple regressions for each step of feature learning, resulting in more severe bias and variance, and greater computational cost. When sufficiently rich features are used, we establish that CIRCE is zero if and only if $\varphi(X) \perp \!\!\! \perp Z \mid Y$. In experiments, we show superior performance to previous methods on challenging benchmarks, including learning conditionally invariant image features.
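The two-stage recipe described here can be sketched concretely: ridge-regress kernel features of $Z$ on $Y$ once in advance, then penalize dependence between the learned features and the residuals. The kernel choice, the cross-covariance penalty, and all names below are illustrative assumptions, not the authors' implementation.

```python
# Rough sketch of the CIRCE recipe from the abstract:
# (1) ridge regression from Y to kernelized features of Z (precomputed);
# (2) penalize the cross-covariance between learned features phi(X)
#     and the regression residuals during feature learning.
import numpy as np

def rbf_features(Z, centers, gamma=1.0):
    # Finite-dimensional stand-in for kernelized features of Z.
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def circe_penalty(phi_X, Z, Y, centers, lam=1e-3):
    Kz = rbf_features(Z, centers)                     # features of Z
    Yb = np.c_[Y, np.ones(len(Y))]                    # ridge with bias
    W = np.linalg.solve(Yb.T @ Yb + lam * np.eye(Yb.shape[1]), Yb.T @ Kz)
    R = Kz - Yb @ W                                   # residuals given Y
    n = len(R)
    C = (phi_X - phi_X.mean(0)).T @ (R - R.mean(0)) / n
    return (C ** 2).sum()      # squared norm of the cross-covariance

# Usage: add circe_penalty(...) to the task loss while training phi.
```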
Copy detection patterns (CDP) have emerged as a promising anti-counterfeiting technology for physical object protection. However, the advent of deep learning as a powerful attacking tool has shown that general authentication schemes cannot withstand such attacks. In this paper, we propose a new mathematical model of the printing-imaging channel for the authentication of CDP, together with a new detection scheme based on it. The results show that even copy fakes created by deep learning and unknown at the training stage can be reliably authenticated with the proposed approach, using only digital references of CDP during authentication.
Static subword tokenization algorithms have been an essential component of recent works on language modeling. However, their static nature results in important flaws that degrade the models' downstream performance and robustness. In this work, we propose MANTa, a Module for Adaptive Neural TokenizAtion. MANTa is a differentiable tokenizer trained end-to-end with the language model. The resulting system offers a trade-off between the expressiveness of byte-level models and the speed of models trained using subword tokenization. In addition, our tokenizer is highly explainable since it produces an explicit segmentation of sequences into blocks. We evaluate our pre-trained model on several English datasets from different domains as well as on synthetic noise. We find that MANTa improves robustness to character perturbations and out-of-domain data. We then show that MANTa performs comparably to other models on the general-domain GLUE benchmark. Finally, we show that it is considerably faster than strictly byte-level models.
This paper presents the development of an AI-based language learning platform, Revita. It is a freely available intelligent online tutor, developed to support learners of multiple languages, from low-intermediate to advanced levels. It has been in pilot use by hundreds of students at several universities; their feedback and needs are shaping its development. One of the main emerging features of Revita is the introduction of a system of linguistic constructs as the representation of domain knowledge. The system of constructs is developed in close collaboration with experts in language teaching. Constructs define the types of exercises and the content of the feedback, and enable detailed modeling and evaluation of learning progress.
In a typical car-following scenario, target vehicle speed fluctuations act as an external disturbance to the host vehicle and in turn affect its energy consumption. To control a host vehicle in an energy-efficient manner using model predictive control (MPC), and moreover, to enhance the performance of an ecological adaptive cruise control (EACC) strategy, forecasting the future velocities of a target vehicle is essential. For this purpose, deep recurrent neural network-based vehicle speed prediction using long short-term memory (LSTM) and gated recurrent units (GRU) is studied in this work. Besides these, the physics-based constant velocity (CV) and constant acceleration (CA) models are discussed. The sequential time-series data for training (e.g. speed trajectories of the target and its preceding vehicles obtained through vehicle-to-vehicle (V2V) communication, road speed limits, and current and future traffic light phases collected using vehicle-to-infrastructure (V2I) communication) is gathered from both urban and highway networks created in the microscopic traffic simulator SUMO. The proposed speed prediction models are evaluated for long-term predictions (up to 10 s) of target vehicle future velocities. The results reveal that the LSTM-based speed predictor achieves the best prediction accuracy on unseen test datasets, showcasing better generalization ability. Furthermore, the performance of the EACC-equipped host car on the predicted velocities is evaluated, and its energy-saving benefits for different prediction horizons are presented.
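A minimal sketch of an LSTM-based speed predictor in the spirit of this setup is shown below: past feature sequences in, a multi-step velocity forecast out. The input features, horizon, and layer sizes are illustrative assumptions, not the paper's architecture.

```python
# LSTM speed predictor sketch: map a window of past observations to a
# vector of future target-vehicle speeds (direct multi-step forecast).
import torch
import torch.nn as nn

class SpeedLSTM(nn.Module):
    def __init__(self, n_features=5, hidden=64, horizon=10):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2,
                            batch_first=True)
        self.head = nn.Linear(hidden, horizon)   # one speed per future step

    def forward(self, x):          # x: (batch, past_steps, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])              # (batch, horizon)

model = SpeedLSTM()
# Hypothetical features: target speed, preceding-vehicle speed (V2V),
# speed limit, traffic-light phase/timing (V2I), inter-vehicle gap.
past = torch.randn(32, 20, 5)      # 20 past steps for a batch of 32
pred = model(past)                 # 10 predicted speeds (10 s at 1 Hz)
loss = nn.functional.mse_loss(pred, torch.randn(32, 10))
```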